***** REPLICATION FILE for Stimulant or depressant? Resource-related income shocks and conflict***
****  Authors: Kai Gehring, Sarah Langlotz, Stefan Kienberger ************************************
**** E-Mail: sarah.langlotz@uni-goettingen.de*****************************************************
*** Review of Economics and Statistics************************************************************
**************************************************************************************************
A] Folder description 

1) After downloading the data from the Harvard Dataverse, create the following three folders:
	- "processed"
	- "graphs"
	- "tables"

2) Follow the instructions in dataprep_GLK.do and execute the file.

3) Follow the instructions in analysis_GLK.do and execute the file to create the tables and graphs in the paper.

**************************************************************************************************
B] Data dictionary

Variable names (as created and used in analysis_GLK.do) in parentheses. Source file name at the end of the description in "". It is indicated whether the source data are uploaded to the Harvard Dataverse or cannot be shared. Several datasets were generated using ArcGIS. We share the ArcGIS output in the Harvard Dataverse.

Any Processing Lab (anylab): 
We count all types of heroin laboratories. This variable takes on the value 1 if there is at least one lab in a district i, and 0 otherwise. As described in Appendix H, we geo-reference maps from UNODC reports regarding drug markets, labs, and trafficking routes, assign coordinates to the labs, and later compute district averages. 
Source: UNODC (2006/07, 2014, 2016) processed in ArcGIS. 
File generated using ArcGIS: "Districts_UNODC_Data.xlsx" (shared in the Harvard dataverse)

Any Military Base (time_camp_1, time_camp_2, time_camp_3): 
Used to compute a proxy for government control (distance to foreign military base). This variable takes on the value 1 if there is at least one open military base in a district i in year t, and 0 otherwise. The approach is described in detail in Appendix H, page 5. Note that we are most likely not capturing all existing locations, as we did not receive the exact information about opening and closing for all military bases. Opening and closing dates were coded with the available information; if there was no information about shutting down a base we assume it is still active. 
Source: For the more well-known bases, we use Wikipedia’s GeoHack program; for the less well-documented bases, we use Wikimapia and Google Maps satellite data. The data was processed in ArcGIS and used to compute distances. 
File with the final list and coordinates of the bases (generated using ArcGIS): "Military Bases and Camps.xlsx" (shared in the Harvard dataverse). 
File with the distances (also generated using ArcGIS): "distances.csv" (shared in the Harvard dataverse).

(Log) Battle-Related Deaths (lnbrd) and conflict types (smallconflict, lowconflict, conflict, war): 
This variable measures the best (most likely) estimate of total fatalities resulting from an event, with an event being defined as “[an] incident where armed force was [used] by an organized actor against another organized actor, or against civilians, resulting in at least 1 direct death at a specific location and a specific date.” A direct death is defined as “a death relating to either combat between warring parties or violence against civilians.” Note that the Uppsala Conflict Data Program Georeferenced Event Dataset (UCDP GED) only includes BRD of events that belong to a dyad (“two conflicting primary parties or party killing unarmed civilians”) that reached in total at least 25 BRD within one year. If the dyad generated events with fewer than 25 BRD in the previous or subsequent years, they are still counted if the dyad had reached the 25 BRD threshold in another year. We construct a continuous measure (log of BRD) and binary outcomes from all BRD of any party or any type of violence (state-based, non-state, or one-sided violence). To capture the lowest level of conflict in a binary measure, we classify a district-year observation with at least five BRD as small conflict. We then increase the threshold to 10 for the next level of conflict intensity (low conflict). In analogy to the threshold used in macro-level analyses, we call a district-year observation conflict if there are more than 25 BRD. At the top, we take a threshold of 100 BRD for the most severe level of violence, which we call war. Since UCDP GED provides information on the parties and the type of violence, we also construct specific outcome measures according to those categories. Besides different measures of incidence, we also construct measures of onset and ending. We define conflict onset as the incidence of a conflict in a district where there was no conflict in the previous year. Years of ongoing conflict are set to missing. 
In analogy, a conflict ending is defined as a year without conflict in a district where there was conflict in the previous year. We also set the ending variable to missing for observations that were at peace in the previous year and remained at peace in the current year, following the standards in the literature. 
Source: UCDP GED (Sundberg & Melander, 2013; Croicu & Sundberg, 2015). "getdistrictGED.csv" and "ged40.xlsx" (shared in the Harvard dataverse).
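
The incidence, onset, and ending definitions above can be illustrated with a minimal Python sketch. It is not the code in dataprep_GLK.do. The function names are hypothetical, treating every threshold as inclusive (>=) is an assumption, and coding the ending variable as 0 in a year in which a conflict starts is also an assumption, since the description does not cover that case.

```python
from typing import Dict, List, Optional

# BRD thresholds from the data dictionary. Treating each threshold as
# inclusive (>=) is an assumption of this sketch.
THRESHOLDS = {"smallconflict": 5, "lowconflict": 10, "conflict": 25, "war": 100}


def incidence(brd: int) -> Dict[str, int]:
    """Binary incidence indicators for one district-year BRD count."""
    return {name: int(brd >= cut) for name, cut in THRESHOLDS.items()}


def onset(series: List[int]) -> List[Optional[int]]:
    """Onset: 1 if conflict now and none last year; None (missing) if ongoing."""
    if not series:
        return []
    out: List[Optional[int]] = [None]  # first year has no previous year
    for prev, cur in zip(series, series[1:]):
        if prev == 1 and cur == 1:
            out.append(None)  # ongoing conflict is set to missing
        else:
            out.append(int(prev == 0 and cur == 1))
    return out


def ending(series: List[int]) -> List[Optional[int]]:
    """Ending: 1 if conflict last year and none now; None if peace in both years."""
    if not series:
        return []
    out: List[Optional[int]] = [None]  # first year has no previous year
    for prev, cur in zip(series, series[1:]):
        if prev == 0 and cur == 0:
            out.append(None)  # peace in both years is set to missing
        else:
            # Assumption: a year in which conflict starts is coded 0.
            out.append(int(prev == 1 and cur == 0))
    return out
```

Each function takes the yearly 0/1 incidence series of one district in chronological order.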

Calorie Intake (calories_pm): 
We use a questionnaire in which women self-report amounts, frequencies, and sources of a large set of food items, to construct measures on calorie intake and food insecurity. We multiply amounts consumed with kcal values for that food item to get total household calorie intake. Total household daily calorie intake is divided by the number of members that were resident and ate at least dinner regularly in the household during the last seven days to get per capita measures. 
Source: For kcal values, we use the CSO & The World Bank (2011). The questionnaire responses are from the NRVA women’s questionnaire (CSO, 2003, 2005, 2007/08, 2011/12). 
Source files: cannot be shared for proprietary reasons.

Consumer Price Index (used to adjust all prices, e.g., food, drugs, and other commodities, for inflation): 
Source: For the Euro area (19 countries), we draw data from the OECD (2016); for the remaining countries (2010 = 100), we use IMF data (2016). 
Source files: "oecd_CPI.xlsx", "imf_CPI.xlsx" (shared in the Harvard dataverse)

Dietary Diversity (dietarydiversity): 
This variable varies between 0 and 8, with eight indicating a high food diversity. According to Wiesmann et al. (2009, p. 5) “Dietary diversity is defined as the number of different foods or food groups eaten over a reference time period, which in my case is one week, not regarding the frequency of consumption.” We classify the different food items from the survey into eight food groups as explained in Wiesmann et al. (2009). These groups are staples, pulses, vegetables, fruit, meat/fish, milk/dairy, sugar, and oil/fat. 
Source: NRVA (CSO, 2003, 2005, 2007/08, 2011/12). 
Source files cannot be shared for proprietary reasons.
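
As a rough illustration of the dietary diversity count described above, the following Python sketch counts the distinct food groups among the items a household reports. The item-to-group mapping is hypothetical and much shorter than the NRVA item list; only the eight group names come from the description.

```python
from typing import Iterable

# Hypothetical mapping of survey food items to the eight food groups.
FOOD_GROUPS = {
    "rice": "staples",
    "bread": "staples",
    "lentils": "pulses",
    "spinach": "vegetables",
    "apple": "fruit",
    "chicken": "meat/fish",
    "milk": "milk/dairy",
    "sugar": "sugar",
    "oil": "oil/fat",
}


def dietary_diversity(items_eaten: Iterable[str]) -> int:
    """Number of distinct food groups eaten in the reference week (0 to 8).

    Frequency of consumption is ignored; unknown items are skipped.
    """
    return len({FOOD_GROUPS[item] for item in items_eaten if item in FOOD_GROUPS})
```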

Distance/Proximity/Travel Time to Kabul and Kandahar, Kunduz, Jalalabad, Hirat, and Mazari Sharif (next five largest cities) (totaltime_1, totaltime_2, totaltime_3, totaltime_1_anycity, totaltime_2_anycity, totaltime_3_anycity): 
For the proximity to Kabul and other main cities, we define binary indicators for the distance being smaller than 75 km (1 if < 75) or smaller than 100 km (1 if < 100). In analogy to these categories, we construct indicators for the travel time to Kabul or one of the other main cities falling below 2 or 3 h. We use the shapefiles provided by the Afghan statistical authority on the 398 Afghan districts. Note that the shapefiles available at www.gadm.org do not reflect the current status of administrative division in Afghanistan, and instead we use the one from Empirical Studies of Conflict (ESOC Princeton, https://esoc.princeton.edu/data/administrative-boundaries-398-districts). To compute the distances, we first create the centroid of each district polygon. To compute road distances, we combine road shapefiles from the official Afghan authorities with street maps from OpenStreetMap, which were improved by voluntary contributors to close gaps in the official maps. 3D distances were computed using elevation data from the US Geological Survey (https://www.usgs.gov/centers/eros/science/usgs-eros-archive-digital-elevation-global-30-arc-second-elevation-gtopo30?qt-science_center_objects=0#qt-science_center_objects, accessed July 9, 2018). We add the elevation information to the shapefile containing the roads, and then compute and save three-dimensional distances. We then use the Network Analyst in ArcGIS to set up a network between all district centroids; centroids that do not overlap with a street are clipped to the street in that district that is closest in terms of straight-line (as-the-crow-flies) distance. Then, we compute the most efficient routes using road distances in two and three dimensions. The distances are saved in a matrix and exported to a table that is further processed in Stata. For the variable “distance to other main cities” we use the minimum distance to any of the five cities. 
For travel time we use the distinction of roads in three classes (motorways, rural, urban), and assign commonly used values for average traveling speed for that road type based on three sources. 
Sources: The first source is UNESCAP (http://www.unescap.org/sites/default/files/2.4.Afghanistan.pdf, p. 14, last accessed August 28, 2019), which assumes that the speed on motorways is 90 km/h and on urban roads 50 km/h. The second source is IRU (https://www.iru.org/apps/infocentreitem-action?id=560&lang=en), which states no limits except for urban areas with 50 km/h. The third source is WHO (http://apps.who.int/gho/data/view.main.51421, last accessed August 28, 2019), reporting 90 km/h for rural roads. We choose the following average traveling speeds, assuming no strictly enforced limits and little traffic on motorways (120 km/h), and accounting for some traffic on rural roads (90 km/h - 10 km/h) and moderate traffic in cities (50 km/h - 20 km/h). Thus our main choice is the following. Motorways: 120 km/h, rural: 80 km/h, urban: 30 km/h. These choices are not perfect, but we verify that our results hold with other variations as well.
Source files: 
- Files with the temporal and spatial distances generated using ArcGIS: "distances.csv", "Centroid_to_Centroid_Matrix.txt" (shared in the Harvard dataverse).
- ESOC Files generated using ArcGIS: "district-neighbors_codes.xlsx", "esoc.csv" (shared in the Harvard dataverse).
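
The travel-time logic described above can be sketched in Python as follows. The speeds are the main choice stated in the description; the route representation (a list of road-class and length pairs) and the function names are hypothetical, and the actual computation was done in ArcGIS and Stata.

```python
from typing import List, Tuple

# Average traveling speeds in km/h (main choice from the data dictionary).
SPEED_KMH = {"motorway": 120.0, "rural": 80.0, "urban": 30.0}


def travel_time_hours(route: List[Tuple[str, float]]) -> float:
    """Total travel time for a route given as (road class, length in km) pairs."""
    return sum(km / SPEED_KMH[road_class] for road_class, km in route)


def within_hours(route: List[Tuple[str, float]], limit_h: float) -> int:
    """Binary indicator: 1 if the travel time falls strictly below limit_h."""
    return int(travel_time_hours(route) < limit_h)
```

For example, a hypothetical route of 120 km of motorway, 40 km of rural road, and 15 km of urban road takes 1 + 0.5 + 0.5 = 2 hours, so it counts as below 3 h but not below 2 h.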

Drug Prices (heroinprice, complementprice, cocaineprice): 
Variables are normalized so that prices vary between 0 and 1. We use data on average prices per gram across all available countries in Europe for the following drugs: amphetamines, cocaine, ecstasy, heroin (brown). To construct the average price of alternative drugs we use a mean of the three stimulant drugs amphetamines, cocaine, and ecstasy. For the analysis we convert all drug prices into constant 2010 euros per gram. We then normalize the prices by using a linear min-max function such that all prices vary between 0 and 1. Some regressions use the logarithm of prices using the transformation of log(price+0.01). 
Source: European Monitoring Center for Drugs and Drug Addiction (EMCDDA). 
Source files: "brownheroinprice.xlsx", "cocaineprice.xlsx", "amphetamineprice.xlsx", "Ecstasyprice.xlsx", "opium_out_weighted.xls" (shared in the Harvard dataverse).
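
A minimal Python sketch of the price transformations described above (mean of the three stimulant prices, linear min-max normalization, and log(price+0.01)). The function names and any input values are illustrative, not taken from the shared files.

```python
import math
from typing import List


def complement_price(amphetamine: float, cocaine: float, ecstasy: float) -> float:
    """Average price of the alternative (stimulant) drugs."""
    return (amphetamine + cocaine + ecstasy) / 3.0


def minmax_normalize(prices: List[float]) -> List[float]:
    """Linear min-max normalization so that prices vary between 0 and 1."""
    lo, hi = min(prices), max(prices)
    if hi == lo:
        raise ValueError("min-max normalization is undefined for a constant series")
    return [(p - lo) / (hi - lo) for p in prices]


def log_price(normalized_price: float) -> float:
    """log(price + 0.01), which stays defined when the normalized price is 0."""
    return math.log(normalized_price + 0.01)
```

Adding 0.01 before taking logs matters because the min-max normalization maps the lowest price in the series to exactly 0.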

Economically Improved (economicimprove): 
This variable refers to the question “How do you compare the overall economic situation of the household with 1 year ago?” A value of 1 indicates much worse, 2 slightly worse, 3 same, 4 slightly better, and 5 much better. This is a self-reported measure. Source: NRVA (CSO, 2003, 2005, 2007/08, 2011/12). 
Source files cannot be shared for proprietary reasons.

Ethnic Groups (pashtun_ethno, taliban1996): 
We record the presence of ethnic groups at the district level based on the Ethnologue (Gordon, 2005) and the Geo-Referencing of Ethnic Groups (GREG, Weidmann et al., 2010) datasets. We use the GIS coordinates of all ethnic groups. Both datasets are based on the historical distribution of ethnic groups or language diffusion. This is an advantage as it ensures that the boundaries are not endogenous to changes during our period of observation. It is partly a disadvantage if groups and countries changed over time. In Afghanistan, the country boundary did not change. Ethnic group populations certainly change to some degree over time, so that all variables more precisely capture the historic homelands of ethnic groups rather than the current settlement areas. An alternative definition relies on native languages (including Pashto) present in a district using the NRVA 2003 household survey, which is available for a subset of districts. An NRVA-based measure of districts with a Pashtun majority correlates more strongly with the Ethnologue measures. 
Files generated using ArcGIS: "GREG_ethnologue_comparison.xlsx", "GREG_Ethnicity.xls" (shared in the Harvard dataverse).

Ethnic Trafficking Route (Ethnic_Connections): 
The variable takes on the value of 1 if there is a potential trafficking route leading from a district to at least one unofficial border crossing point without crossing the ethnic homeland of another group. The underlying intuition is that trafficking is cheaper and significantly easier to conduct, and the accruing additional profits are higher, if there is no need to cross the area of other ethnic groups to transport over the border. Source: For data on unofficial border crossings, we used the UNODC; for information about the homelands of ethnic groups, we used the (GREG) dataset (Weidmann et al., 2010). 
Files generated using ArcGIS: "Districts_UNODC_Data.xlsx", "Ethnicities_and_Trafficking.xlsx" (shared in the Harvard dataverse).

Food Expenditures (Paasche/Laspeyres) (hhexp_total_2011, hhexp_total_paasche2011, hhexp_total_lasp2011): 
Precise food amounts were merged with local prices to estimate household food expenditure. We show three food expenditure measures, which are all measured in constant 2011 prices, i.e., prices of the 2011/12 survey wave. Only food items that appear in all three waves are included to build the measure. The first measure “Food Exp. 2011 Prices” does not account for spatial price differences. “Food Exp. 2011 Prices, Paasche” and “Food Exp. 2011 Prices, Laspeyres” adjust for spatial price differences, since households in different districts face different prices. Missing values of district prices are replaced by the province median, which in case of missing values is replaced by the national median price. For close to all reported food items, prices have been given in the district questionnaire. Prices vary at the district level. Following the literature, we include food items from all possible sources, i.e., purchased food or food in the form of gifts etc. Information on food and drinks consumed outside the house (from the male survey section) is also included in the total food expenditure measures (adjusted for inflation and regional price differences depending on the measure). Expenditures are measured in per capita terms by dividing the total household food expenditure by the number of household members (resident and ate at least dinner regularly in the household during the last seven days). We use the section on food consumption from the NRVA women’s questionnaire as this section offers precise amounts per food item. Source: NRVA women’s and male’s questionnaire and district questionnaire (CSO, 2005, 2007/08, 2011/12). 
Source files cannot be shared for proprietary reasons.
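
The replacement of missing district prices described above can be illustrated with the following Python sketch for a single food item. The data layout and function name are hypothetical, and computing the national median over all observed district prices is an assumption; the NRVA source files themselves cannot be shared.

```python
from statistics import median
from typing import Dict, Optional, Tuple

# Key: (province, district); value: observed price or None if missing.
PriceTable = Dict[Tuple[str, str], Optional[float]]


def fill_missing_prices(prices: PriceTable) -> Dict[Tuple[str, str], float]:
    """Replace missing district prices by the province median, and by the
    national median if the province has no observed price at all."""
    observed = [p for p in prices.values() if p is not None]
    if not observed:
        raise ValueError("no observed prices for this item")
    # Assumption: national median taken over all observed district prices.
    national_median = median(observed)

    by_province: Dict[str, list] = {}
    for (province, _), price in prices.items():
        if price is not None:
            by_province.setdefault(province, []).append(price)

    filled = {}
    for (province, district), price in prices.items():
        if price is None:
            in_province = by_province.get(province)
            price = median(in_province) if in_province else national_median
        filled[(province, district)] = price
    return filled
```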

GDP Deflator and exchange rate: 
We use a GDP deflator for the United States with 2010 as the base year. 
Source: World Bank (2016). For the EU we obtain data from economywatch.org with 2010 as a base year as well. 
Source files: "EU_deflator.xlsx", "GDPdeflator.xlsx", "exchangerates.csv" (shared in the Harvard dataverse).

Insecurity/Violence Shock (foodinsecure): 
The share of sampled households per district that have experienced a shock due to insecurity/violence. At the household level, the variable takes on the value of 1 if the household has experienced an insecurity/violence shock. 
Source: NRVA survey (CSO, 2005, 2007/08, 2011/12). 
Source files cannot be shared for proprietary reasons.

Legal Opioids (lnprescription): 
Used to compute the opium shock variables. Since most single publications do not cover our whole sample period, we want to cross-verify the numbers using a variety of sources. Source: A main source is the US CDC Public Health surveillance report 2017 (https://stacks.cdc.gov/view/cdc/47832, last accessed August 28, 2019). Other important sources were Manchikanti et al. (2012); Kenan & Mack (2012); Dart et al. (2015). 
Source file: "Opioid Prescriptions.xlsx" (shared in the Harvard dataverse)

Local Opium Price (opiumAFGprice): 
We utilize reports of (monthly) province-level dry opium prices by farmers and by traders as well as country-wide yearly data on fresh opium farm-gate prices weighted by regional production. The province-level opium prices of farmers and traders are highly correlated, with a correlation coefficient close to 1 (0.998). The correlation between the country level farm-gate price and the province-level farm-gate price is 0.66, significant at the 1%-level. While the province-level prices are only available from 2006 to 2013 and for a subset of provinces, they are still very helpful in identifying whether international prices are correlated with local prices. We use the country-wide yearly data on fresh opium farm-gate prices in Afghanistan interacted with the suitability as one proxy for opium profitability in our regressions in Table 2, panel A. 
Source: Annual Afghanistan Opium Price Monitoring reports (UNODC). 
Source files: "opiumpriceAFG_yearly.xlsx" (shared in the Harvard dataverse)

Log Opioid Prescription Shock (lnprescription_shock): 
Based on lnprescription and opium suitability. See corresponding variables for sources and source files.

Log Opium Shock Variables (opiumshock_rw_lnopiumAFGnorm, opiumshock_rw_lnheroinnorm, opiumshock_rw_lncomplementnorm): 
Based on opium suitability and prices as well as heroin and complement prices. See corresponding variables for sources and source files.

Log Wheat Shock (lnwheat_rw_shock): 
Based on wheat price and wheat suitability. See corresponding variables for sources and source files. 

Luminosity (nightlight): 
We use this variable as a proxy for GDP and development (Henderson et al., 2012). The yearly satellite data are cloud-free composites made using all the available smooth resolution data for calendar years. The products are 30 arc second grids, spanning -180 to 180 degrees longitude and -65 to 75 degrees latitude. A number of constraints are used to select the highest quality data for entry into the composites: Data are from the center half of the 3000 km wide OLS swaths. Lights in the center half have better geolocation, are smaller, and have more consistent radiometry. Sunlit data and glare are excluded based on the solar elevation angle, Moonlit data based on a calculation of lunar illuminance. Observations with clouds are excluded based on clouds identified with the OLS thermal band data and NCEP surface temperature grids. Lighting features from the aurora have been excluded in the northern hemisphere on an orbit-by-orbit manner using visual inspection. The data was processed using ArcGIS. 
Source: Version 4 DMSPOLS nighttime lights time series, National Oceanic and Atmospheric Administration-National Geophysical Data Center (NOAA/NGDC, https://www.ngdc.noaa.gov, last accessed August 28, 2019). We take the logarithm. 
See files starting with "F1" (created using ArcGIS, shared in the Harvard dataverse).

Markets (Major/Sub) and Sum of all Markets (market, total_markets): 
The first variable takes on the value 1 if there is at least one major or sub-market in district i, and 0 otherwise. The second variable counts all opium markets in a district (both sub and major). 
Source: UNODC reports on drug markets, labs, and trafficking routes (e.g., UNODC 2006/07, 2014, 2016). 
Source file: "Districts_UNODC_Data.xlsx" (shared in the Harvard dataverse).

Market Access (ma_totmarkets_3D, ma_totmarkets_2d, ma_nightlight_2d, ma_nighlight_3d):
We define market access using three components: the importance of each district, proxied by either the number of opium markets or mean luminosity (or population); the distance between the district and all other districts; and a factor that discounts districts that are farther away. We use a factor of 1, as in Donaldson & Hornbeck (2016). To take account of the topography and mountainous terrain in Afghanistan, we compute distances using the two-dimensional road network (Market Access 2D) as well as a three-dimensional road network when adjusting for elevation (Market Access 3D). 
Files created using ArcGIS: "Districts_UNODC_Data.xlsx", "Centroid_to_Centroid_Matrix.txt" and see the files starting with "F1" (shared in the Harvard dataverse).
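
One common way to write such a measure is MA_i = sum over j of W_j * d_ij^(-theta), where W_j is the importance of district j, d_ij the road distance, and theta the discount factor (here 1). The Python sketch below implements this form under stated assumptions: the exact functional form and the exclusion of the own district are not spelled out in the description, and the function name is hypothetical.

```python
from typing import List, Sequence


def market_access(
    distances: Sequence[Sequence[float]],
    importance: Sequence[float],
    theta: float = 1.0,
) -> List[float]:
    """Distance-discounted sum of other districts' importance.

    distances[i][j] is the road distance between the centroids of districts
    i and j (2D or 3D); importance[j] is, e.g., the number of opium markets
    or mean luminosity in district j. The own district (j == i) is skipped,
    which is an assumption of this sketch.
    """
    n = len(importance)
    access = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if j != i:
                total += importance[j] * distances[i][j] ** (-theta)
        access.append(total)
    return access
```

Passing the 2D or the 3D distance matrix gives the 2D or the 3D variant of the measure.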

Mixed/Taliban Territory 1996 (taliban1996): 
The binary indicator on Taliban Territory that we create takes on the value 1 if a district belongs to the territory that was occupied or under the control of the Taliban in 1996, and 0 otherwise. A second indicator (Taliban Territory 1996 - No North) takes on a value of 1 if the district is exclusively occupied by the Taliban and is characterized by no presence of the Northern Alliance. We use an existing map which indicates the territory of the Taliban in 1996 as well as the territory of other major groups of the Northern Alliance (Dschunbisch-o Islami, Dschamiat-i Islami, Hizb-i Wahdat). We geo-referenced the map and aligned it with the district boundaries; in many cases, the division was quite clearly aligned or overlapping with a district boundary, and in the other cases we chose the closest district boundary. We classify a district as a Mixed Territory if it is part of the Taliban 1996 territory and part of the territory of any of the three groups belonging to the Northern Alliance. Source: The map is from Dorronsoro (2005), and more details can be found in Giustozzi (2009). 
File created using ArcGIS: "GREG_Ethnicity.xls" (shared in the Harvard dataverse).

(Log) Opium Cultivation and Revenues (cultivation, lncultivation, revenue, lnrevenue): 
These variables measure opium cultivation in hectares and opium revenues based on local fresh opium farm-gate prices at harvest time. Data at the district level are an estimate from the data at the province level. We use logged values for opium cultivation and for revenues. From opium cultivation and the respective yields we were able to calculate actual opium production at the district-year level. We also constructed opium revenues by multiplying opium production in kg by the fresh opium farm-gate prices at harvest time in constant 2010 EUR/kg. 
Source: Annual Opium Poppy Survey (UNDCP, 2000) and Afghanistan Opium Survey (UNODC, 2001-2014). 
Source files (the last being created using ArcGIS): "2002_2003_OpiumYield.xlsx", "2002_OpiumYield.xlsx", "opiumpriceAFG_yearly.xlsx", "OpiumSurvey2015.csv", "opium_prices_farmer_and_trader.xlsx" (shared in the Harvard dataverse).

Opium Suitability (suitability_rw_opium): 
This is an index with possible values ranging between 0 and 1 which acts as a proxy for the potential for opium production based on exogenous underlying information about land cover, water availability, climatic suitability, and soil suitability. The environmental as well as climatic suitability to cultivate opium poppy (Papaver somniferum) is characterized by different factors such as the prevailing physio-geographical and climatic characteristics, using climatic suitability based on the EcoCrop model from Hijmans et al. (2001). The factor determined to be most important by experts is land cover (S1, 0.41; the sum of the weights equals 1.0), followed by water availability (S2, 0.28) and climatic conditions (S3, 0.21). This is in line with additional studies previously carried out by UNODC and described in the World Drug Report (2011) for Myanmar. The data and the index itself were modeled at a 1 km2 resolution and then aggregated to the district units by an area-weighted mean approach. The original indicator values were normalized using a linear min–max function between a possible value range of 0 and 100 to allow for comparison and aggregation. Only the land cover indicator was normalized integrating expert judgments through an Analytical Hierarchy Process (AHP) approach. The four indicators were then aggregated by applying weighted means (weights were verified through expert consultations building on the AHP method). None of the input factors constituting the index is itself affected to a major degree by conflict, which is the outcome variable. Consequently, the index values by district can be considered as exogenously given. We weight the opium and wheat suitabilities with the (lagged) population distribution within the districts. 
This is helpful as, for instance, the south features large desert areas and at the same time concentrated areas with dense population, and accounting for the suitability in uninhabited desert areas might be misleading (although our results are not significantly affected by this choice). 
Source: The index was developed in the context of a study in collaboration with UNODC; and is described in detail in a publication in a geographical science journal (Kienberger et al., 2017). 
Files created using ArcGIS: "opium_suitability.xls", "opium_suitability_weighted.xls" (shared in the Harvard dataverse).

Pashtun (pashtun_ethno): 
Our binary indicator takes on the value 1 if Pashtuns are present to any degree in a district i, regardless of whether they were the majority group, and 0 otherwise. Our main measure relies on Ethnologue. An alternative measure uses GREG. 
File created using ArcGIS: "GREG_ethnologue_comparison.xlsx" (shared in the Harvard dataverse).

Population (used for weighting opium and wheat suitability): 
These are minimally modeled gridded population data that incorporate census population data from the 2010 round of censuses. Population estimates are derived by extrapolating the raw census estimates to a series of target years and are provided for the years 2000, 2005, 2010, 2015, and 2020. We use the interpolated data from 2000 to 2015. We then take the logarithm. 
Source: The Center for International Earth Science Information Network - CIESIN - Columbia University. 2016. Gridded Population of the World, Version 4 (GPWv4): Administrative Unit Center Points with Population Estimates. Palisades, NY: NASA Socioeconomic Data and Applications Center (SEDAC) http://dx.doi.org/10.7927/H4F47M2C, last accessed August 28, 2019. 
Files created using ArcGIS: "pwpop2000.xls", "pwpop2005.xls", "pwpop2010.xls", "pwpop2015.xls" (shared in the Harvard dataverse).

Ruggedness: 
We calculate the average ruggedness index for every district. While ruggedness refers to the variance in elevation, we also use raw elevation data. 
Source: Elevation data from NASA Shuttle Radar Topography Mission (SRTM) data set. The data on terrain ruggedness is the same that was used in Nunn & Puga (2012), although we use it on a more disaggregated level. The dataset and a detailed documentation are available at http://diegopuga.org/data/rugged/, last accessed August 28, 2019. 
Files created using ArcGIS: "NunnPuga_ruggedness_fromarcgis.csv" (shared in the Harvard dataverse).

Significant activities (SIGACTS) to verify the main results: 
These variables measure SIGACTS (Significant Activities) in a given district based on military reports. The SIGACTS version we obtained includes the sum of events per district-year and allows us to distinguish events according to the type of (i) combat, (ii) target (forces), and (iii) technology. Regarding (i) and (iii), SIGACTS are defined as direct fire (DF), indirect fire (IDF), and improvised explosive devices (IED). DF attacks can be defined as close combat events that are characterized by the use of weapons like small arms or rocket-propelled grenades. IDF attacks, including mortars and rockets, can be heard within a large area, but are less precise when being launched from great distances. While DF and IDF involve fighters, IEDs involve less risk for the perpetrators. IEDs can be placed around roads and directed against moving targets, for instance pro-government convoys. While DF and IDF are less capital-intensive, IED events capture more capital/technology-based conflict. We can also differentiate between events that are related to casualties and those not related to casualties. Regarding (ii), we can distinguish between the actors involved, i.e., whether Afghan or coalition forces have been involved in the event. 
Source: We use data from Shaver & Wright (2016). 
Source files cannot be shared for proprietary reasons.

Sum of Assets (weighted) (sumassets, sumassets_weighted): 
The number of assets a household possesses out of a set of assets that is constant over 3 survey waves. This set consists of Radio/Tape, Refrigerator, TV, VCR/DVD, Sewing Machine, Thuraya (any phone), Bicycle, Motorcycle, Tractor/Thresher, and Car. Sum of Assets weighted is the sum of assets, each weighted by the proportion of households not possessing the specific item. 
Source: NRVA (CSO, 2003, 2005, 2007/08, 2011/12). 
Source files cannot be shared for proprietary reasons.
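
The weighting scheme described above can be sketched in Python as follows: each asset receives a weight equal to the share of households that do not own it, so rarer assets count more. The data layout and function names are hypothetical, and computing the shares over the list of households passed in is an assumption about the reference sample.

```python
from typing import Dict, List

Household = Dict[str, int]  # asset name -> 1 if owned, 0 otherwise


def asset_weights(households: List[Household]) -> Dict[str, float]:
    """Weight per asset: proportion of households NOT possessing the item."""
    n = len(households)
    assets = households[0].keys()
    return {a: 1.0 - sum(h[a] for h in households) / n for a in assets}


def sum_assets(household: Household) -> int:
    """Unweighted number of assets the household possesses."""
    return sum(household.values())


def sum_assets_weighted(household: Household, weights: Dict[str, float]) -> float:
    """Sum of owned assets, each weighted by its non-ownership share."""
    return sum(owned * weights[asset] for asset, owned in household.items())
```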

Vegetation Health Index (vhi): 
We compute an index that captures inter-annual variations in drought conditions, the vegetation health index (VHI) of FAO (Van Hoolst et al., 2016). VHI is a composite index joining the Vegetation Condition Index (VCI) and the Temperature Condition Index (TCI, Kogan 1995). Low values of VHI represent drought conditions. This is a combination of low values of the observed VCI (relatively low vegetation) and higher values of the TCI (relatively warm weather). For details see Van Hoolst et al. (2016). The VHI is calculated from data of Advanced Very High Resolution Radiometer (AVHRR) sensors on board the National Oceanic and Atmospheric Administration (NOAA) and Meteorological Operational Satellite (METOP) satellites. It is superior to simply using precipitation data, which do not directly measure drought conditions, require assumptions about the linearity of the effect and, in particular in Afghanistan, have severe limitations in terms of quality and resolution. The index is based on earth observation data and is available on a monthly basis with a resolution of 1 km2. As cultivation and harvest times differ within Afghanistan, we use the yearly average. The remote-sensing-based index is operationally used to monitor drought conditions in the Global Early Warning System (GEWS). For a similar approach to a VHI, see Harari & La Ferrara (2018). 
See the files starting with "VHI", generated using ArcGIS, "Outputprio.xls" and "PRIO-GRID_1997-2014.csv" for the source data (shared in the Harvard dataverse).

Wheat Price (International) (lnwheatpricenorm): 
Prices are period averages in nominal US dollars with 2005 as the baseline. We use benchmark prices, representative of the global market. They are determined by the largest exporter of a given commodity. 
Source: International Monetary Fund (IMF) Primary Commodity Prices database (IMF, 2005-2017, http://www.imf.org/external/np/res/commod/index.aspx). 
Source files: "IMF_internationalprices.xls", "FAO_barley1991_2014.csv" (FAOSTAT Date: Mon Sep 05 15:05:19 CEST 2016), "FAO_maize1991_2014.csv" (FAOSTAT Date: Mon Sep 05 15:10:30 CEST 2016), "FAO_potato1991_2014.csv" (FAOSTAT Date: Mon Sep 05 15:08:45 CEST 2016), "FAO_rice1991_2014.csv" (FAOSTAT Date: Mon Sep 05 15:09:07 CEST 2016), "FAO_wheat1991_2014.csv" (FAOSTAT Date: Mon Sep 05 15:04:01 CEST 2016) (shared in the Harvard dataverse).

Wheat Suitability (suitability_rw_wheat): 
Seven different soil quality ratings (SQs) are calculated and combined in a soil unit suitability rating (SR, %). The SR represents the percentage of potential yield expected for a given crop/Land Utilization Type (LUT) with respect to the soil characteristics present in a soil map unit of the HWSD and depends on the input/management level. The FAO-GAEZ (2012) model provides for each crop/LUT a comprehensive soil suitability evaluation for all the soil units contained in the Harmonized World Soil Database (HWSD). This is done by the use of individual soil quality ratings. Source: Global Agro-ecological Zones (GAEZ v3.0) by the Food and Agriculture Organization of the United Nations (FAO-GAEZ 2012). Details are provided on http://www.fao.org/nr/gaez/about-data-portal/agricultural-suitabilityand-potential-yields/en/, last accessed August 28, 2019. Go to the section “Agro-ecological suitability and productivity” to find the suitability we use and access the data portal for downloads. 
See the files "sxir_wpo.xls", "opium.xls", "opium_weighted.xls", "scii_brl.xls", "scii_mze.xls", "scii_rcw.xls", "scii_whe.xls", "scii_wpo.xls", "siir_brl.xls", "siir_mze.xls", "siir_rcw.xls", "siir_whe.xls", "siir_wpo.xls", "suii_brl.xls", "suii_mze.xls", "suii_whe.xls", "suii_wpo.xls", "sxir_brl.xls", "sxir_mze.xls", "sxir_whe.xls" for files on soil suitability, generated using ArcGIS (shared in the Harvard dataverse).


**************************************************************************************************
C] Software version 

- Operating System: 
	- Processor: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz   2.80 GHz
	- Installed RAM: 32.0 GB (31.7 GB usable)
	- System type: 64-bit operating system, x64-based processor
	- Edition: Windows 11 Enterprise
	- Version: 21H2
	- Installed on: 16/02/2023

- STATA version 17.0 base 20apr2021 updated to 10jan2023



